42 research outputs found

    Statistische Analyse von Sequenzpopulationen in der Virologie und Immunologie

    In this thesis I examined several topics concerning the relationship between viruses and the human immune system. I expanded and refined a tool (now available as the R package SeqFeatR on CRAN) for analyzing sequence data together with features of these sequences, such as HLA type or tropism (see chapter 4). With this tool I checked whether different multiple-testing correction approaches give different results for sequence data, and how Bayesian inference could be used in this context (see chapter 5). Bayesian inference proved superior to frequentist methods for this kind of problem, because multiple-testing corrections ignore the fact that different positions in a sequence alignment may be connected in the protein product of the sequence and are therefore not independent. Furthermore, I examined HCV sequences with a variant of the bootstrap algorithm to find sequence regions that can be used in court cases with unresolved transmission routes. Two regions were found, one in the hypervariable region and the other at the end of the non-structural protein NS5B (see chapter 9). Proteasomal cleavage of foreign amino acid sequences inside human cells leads to the presentation of fragments of these sequences on the cell surface as epitopes. To be presented, such a fragment must not only bind to the MHC but also have the correct length. Viral evolution should therefore favor viruses that cannot be cut into presentable epitopes. Using epitope data from IEDB and predicted MHC-binding viral sequences, I searched the flanking regions around each epitope for amino acids that may indicate an escape mutation against proteasomal cleavage. Fourteen such amino acids and positions were found (see chapter 7). I created a model of the HBV reverse transcriptase to check whether mutations at certain positions could influence binding of the nucleotide analogue reverse transcriptase inhibitor Tenofovir.
Mutations inside the binding pocket for Tenofovir showed, in experiments by the group of Mengji Lu, a decreased affinity for the drug (see chapter 10). Together with the group of Ralf Küppers I examined NGS data from different types of B cells to search for nearly identical sequences shared between them. We found similar and even identical sequences shared by two, three and even all four kinds of cells in the blood samples of each of the two donors (see chapter 6).

    Quantitative Comparison of Abundance Structures of Generalized Communities: From B-Cell Receptor Repertoires to Microbiomes

    The community, the assemblage of organisms co-existing in a given space and time, has the potential to become one of the unifying concepts of biology, especially with the advent of high-throughput sequencing experiments that reveal genetic diversity exhaustively. In this spirit we show that a tool from community ecology, the Rank Abundance Distribution (RAD), can be turned by the new MaxRank normalization method into a generic, expressive descriptor for quantitative comparison of communities in many areas of biology. To illustrate the versatility of the method, we analyze RADs from various generalized communities, i.e. assemblages of genetically diverse cells or organisms, including human B cells, gut microbiomes under antibiotic treatment and of different ages and countries of origin, and other human and environmental microbial communities. We show that normalized RADs enable novel quantitative approaches that help to understand the structures and dynamics of complex generalized communities.
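
As an illustration of the descriptor named in this abstract, the following is a minimal sketch of a plain rank abundance distribution: relative abundances sorted from the most to the least abundant type. It does not implement the MaxRank normalization, whose details the abstract does not give. The function name `rank_abundance` and the toy sample are hypothetical.

```python
from collections import Counter


def rank_abundance(labels):
    """Return relative abundances sorted from most to least abundant type."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [count / total for count in sorted(counts.values(), reverse=True)]


# Hypothetical toy community: each entry is one observed cell or organism,
# labeled by its type (e.g. a clonotype or a taxon).
sample = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
rad = rank_abundance(sample)

for rank, abundance in enumerate(rad, start=1):
    print(rank, abundance)
```

Two communities with different numbers of types yield RADs of different lengths, which is presumably why a normalization step is needed before they can be compared directly.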

    Comparative computational analysis to distinguish mesenchymal stem cells from fibroblasts

    Introduction: Mesenchymal stem cells (MSCs) are considered to be the most promising stem cell type for cell-based therapies in regenerative medicine. Based on their potential to home to diseased body sites following therapeutic application, these cells could (i) then differentiate into organ-specific cell types to locally replace injured cells or, most prominently, (ii) foster tissue regeneration, including immune modulation, more indirectly by secreting protective growth factors and cytokines. As tissue-resident stem cells of mesenchymal origin, these cells are morphologically and even molecularly (at least concerning the classical marker genes) indistinguishable from cells of similar lineage, particularly fibroblasts. Methods: We used microarray-based gene expression and global DNA methylation analyses together with accompanying computational tools to specify differences between MSCs and fibroblasts, to further unravel potential identity genes, and to highlight MSC signaling pathways with regard to their trophic and immunosuppressive action. Results: We identified 1352 differentially expressed genes. In MSCs there is a strong signature for, e.g., KRAS signaling, known to play an essential role in stemness maintenance; regulation of coagulation and complement, which is decisive for resolving inflammatory processes; and wound healing, which is particularly important for their regenerative capacity. Genes upregulated in fibroblasts predominantly addressed transcription and biosynthetic processes and mapped morphological features of the tissue. Concerning cellular identity, we specified the already known HOX code for MSCs, established a potential HOX code for fibroblasts, and linked certain HOX genes to functional cell-type-specific properties.
Accompanying methylation profiles revealed numerous differentially methylated regions, especially in HOX genes, which might provide additional biomarker potential. Discussion: In conclusion, transcriptomic together with epigenetic signatures can successfully be used to define the cellular identity of MSCs versus fibroblasts and to determine the superior functional properties of MSCs, such as their immunomodulatory potential.

    Biomarker Supervised G-CSF (Filgrastim) Response in ALS Patients

    Objective: To evaluate the safety, tolerability and feasibility of long-term treatment with granulocyte-colony stimulating factor (G-CSF), a well-known hematopoietic stem cell factor, guided by assessment of mobilized bone marrow derived stem cells and cytokines in the serum of patients with amyotrophic lateral sclerosis (ALS) treated on a named patient basis. Methods: 36 ALS patients were treated with subcutaneous injections of G-CSF on a named patient basis and in an outpatient setting. The drug was dosed by individual application schemes (mean 464 Mio IU/month, range 90-2160 Mio IU/month) over a median of 13.7 months (range 2.7 to 73.8 months). Safety, tolerability, survival and change in ALSFRS-R were observed. Hematopoietic stem cells were monitored by flow cytometry analysis of circulating CD34+ and CD34+CD38− cells, and peripheral cytokines were assessed by electrochemiluminescence throughout the intervention period. Analysis of immunological and hematological markers was conducted. Results: Long-term and individually adapted treatment with G-CSF was well tolerated and safe. G-CSF led to a significant mobilization of hematopoietic stem cells into the peripheral blood. Higher mobilization capacity was associated with prolonged survival. Initial levels of serum cytokines, such as MDC, TNF-beta, IL-7, IL-16, and Tie-2, were significantly associated with survival. Continued application of G-CSF led to persistent alterations in serum cytokines, and ongoing measurements revealed the multifaceted effects of G-CSF. Conclusions: G-CSF treatment is feasible and safe for ALS patients. It may exert its beneficial effects through neuroprotective and neuroregenerative activities, mobilization of hematopoietic stem cells, and regulation of pro- and anti-inflammatory cytokines as well as angiogenic factors. These cytokines may serve as prognostic markers when measured at the time of diagnosis.
Hematopoietic stem cell numbers and cytokine levels are altered by ongoing G-CSF application and may serve as treatment biomarkers for early monitoring of G-CSF treatment efficacy in ALS in future clinical trials.

    Modeling and Bioinformatics Identify Responders to G-CSF in Patients With Amyotrophic Lateral Sclerosis

    Objective: Developing an integrative approach to early treatment response classification using survival modeling and bioinformatics with various biomarkers for early assessment of filgrastim (granulocyte colony stimulating factor) treatment effects in amyotrophic lateral sclerosis (ALS) patients. Filgrastim, a hematopoietic growth factor with excellent safety, routinely applied in oncology and stem cell mobilization, had shown preliminary efficacy in ALS. Methods: We conducted individualized long-term filgrastim treatment in 36 ALS patients. The PRO-ACT database, with outcome data from 23 international clinical ALS trials, served as historical control and mathematical reference for survival modeling. Imaging data as well as cytokine and cellular data from stem cell analysis were processed as biomarkers in a non-linear principal component analysis (NLPCA) to identify individual response. Results: Cox proportional hazard and matched-pair analyses revealed a significant survival benefit for filgrastim-treated patients over PRO-ACT comparators. We generated a model for survival estimation based on patients in the PRO-ACT database and then applied the model to filgrastim-treated patients. Model-identified filgrastim responders displayed less functional decline and impressively longer survival than non-responders. Multimodal biomarkers were then analyzed by PCA in the context of model-defined treatment response, allowing identification of subsequent treatment response as early as within 3 months of therapy. Strong treatment response with a median survival of 3.8 years after start of therapy was associated with younger age, increased hematopoietic stem cell mobilization, less aggressive inflammatory cytokine plasma profiles, and preserved pattern of fractional anisotropy as determined by magnetic resonance diffusion tensor imaging (DTI-MRI). 
Conclusion: Long-term filgrastim is safe, is well-tolerated, and has significant positive effects on disease progression and survival in a small cohort of ALS patients. Developing and applying a model-based biomarker response classification allows use of multimodal biomarker patterns in full potential. This can identify strong individual treatment responders (here: filgrastim) at a very early stage of therapy and may pave the way to an effective individualized treatment option

    Comparative transcriptomic and proteomic signature of lung alveolar macrophages reveals the integrin CD11b as a regulatory hub during pneumococcal pneumonia infection

    Introduction: Streptococcus pneumoniae is one of the main causes of community-acquired infections in the lung alveoli in children and the elderly. Alveolar macrophages (AM) patrol alveoli in homeostasis and under infectious conditions. However, the molecular adaptations of AM upon infections with Streptococcus pneumoniae are incompletely resolved. Methods: We used a comparative transcriptomic and proteomic approach to provide novel insights into the cellular mechanism that changes the molecular signature of AM during lung infections. Using a tandem mass spectrometry approach to murine cell-sorted AM, we revealed significant proteomic changes upon lung infection with Streptococcus pneumoniae. Results: AM showed a strong neutrophil-associated proteomic signature, such as expression of CD11b, MPO, neutrophil gelatinases, and elastases, which was associated with phagocytosis of recruited neutrophils. Transcriptomic analysis indicated intrinsic expression of CD11b by AM. Moreover, comparative transcriptomic and proteomic profiling identified CD11b as the central molecular hub in AM, which influenced neutrophil recruitment, activation, and migration. Discussion: In conclusion, our study provides novel insights into the intrinsic molecular adaptations of AM upon lung infection with Streptococcus pneumoniae and reveals profound alterations critical for effective antimicrobial immunity.

    SeqFeatR for the Discovery of Feature-Sequence Associations.

    Specific selection pressures often lead to specifically mutated genomes. The open source software SeqFeatR has been developed to identify associations between mutation patterns in biological sequences and specific selection pressures ("features"). For instance, SeqFeatR has been used to discover new T cell epitopes in viral protein sequences for hosts of given HLA types. SeqFeatR supports frequentist and Bayesian methods for the discovery of statistical sequence-feature associations. Moreover, it offers novel ways to visualize the results of the statistical analyses and to relate them to further properties. In this article we demonstrate various functions of SeqFeatR with real data. The most frequently used set of functions is also provided by a web server. SeqFeatR is implemented as an R package and is freely available from the R archive CRAN (http://cran.r-project.org/web/packages/SeqFeatR/index.html). The package includes a tutorial vignette. The software is distributed under the GNU General Public License (version 3 or later). The web server URL is https://seqfeatr.zmb.uni-due.de

    Odds-ratio plot and Tartan plot for visualization of statistical associations.

    A: Odds-ratio plot, based on an alignment of the region of HIV-1 gp120 around the V3 loop (C296-C331). Here, the feature is the predicted co-receptor tropism of HIV-1 [17] (R5 vs. X4 tropic). Bar heights and colors indicate logarithms of odds ratios and negative logarithms of p values, respectively. A reference sequence and sequence positions can be added in the top and bottom rows for orientation. B: Tartan plot for the synopsis of two association measures for pairs of alignment positions, here: −log p from an association test between alignment position pairs (upper right triangle) vs. Direct Information between these pairs (lower left triangle). Association strengths are color coded (color legend on the right). For orientation, axes can be annotated and sequence substructures can be indicated by lines.
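
The bar heights in panel A are logarithms of odds ratios. A minimal sketch of how such a value could be computed for one alignment column and one amino acid follows. The helper names, the toy data and the 0.5 pseudocount (a common Haldane-Anscombe style correction against zero cells) are assumptions for illustration and are not taken from SeqFeatR.

```python
import math


def contingency_table(residues, has_feature, amino_acid):
    """Count the 2x2 table (feature yes/no) x (amino acid present/absent)."""
    a = b = c = d = 0
    for residue, feature in zip(residues, has_feature):
        present = residue == amino_acid
        if feature and present:
            a += 1  # feature, amino acid present
        elif feature:
            b += 1  # feature, amino acid absent
        elif present:
            c += 1  # no feature, amino acid present
        else:
            d += 1  # no feature, amino acid absent
    return a, b, c, d


def log_odds_ratio(a, b, c, d, correction=0.5):
    """Natural-log odds ratio; the pseudocount (an assumption) avoids division by zero."""
    a, b, c, d = (x + correction for x in (a, b, c, d))
    return math.log((a * d) / (b * c))


# Hypothetical alignment column and hypothetical binary feature (e.g. X4 tropism).
column = ["R", "R", "R", "G", "G", "G", "G", "R"]
x4_tropic = [True, True, True, False, False, False, True, False]

table = contingency_table(column, x4_tropic, "R")
print(table, log_odds_ratio(*table))
```

A positive value means the amino acid is enriched among feature-carrying sequences, a negative value that it is depleted.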

    Comparison of statistical indicators of association.

    200 random contingency tables with total count N = 100, a typical order of magnitude for analyses of sequence-feature association in practice, are analyzed by Fisher’s exact test, yielding p values for the rejection of independence (horizontal axis, not corrected for multiple testing), and by four different BF models, namely K = 1, K = 100, K_D, and uniform model, with corresponding BFs on vertical axis. Solid horizontal black line at BF = 1 and dashed vertical line at p = 0.05 for orientation.
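
The frequentist half of this comparison can be sketched with the standard library alone. The code below draws 200 random 2x2 tables with N = 100 and computes a two-sided Fisher's exact test p value for each. How the original tables were generated is not stated in the caption, so the uniform cell sampling, the 2x2 shape and the seed are assumptions, and the Bayes factor models (K = 1, K = 100, K_D, uniform) are not reproduced here.

```python
import math
import random


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are at most as probable as the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def random_table(n, rng):
    """Distribute n observations uniformly at random over four cells (an assumption)."""
    cells = [0, 0, 0, 0]
    for _ in range(n):
        cells[rng.randrange(4)] += 1
    return tuple(cells)


rng = random.Random(0)  # arbitrary fixed seed for reproducibility
tables = [random_table(100, rng) for _ in range(200)]
p_values = [fisher_exact_two_sided(*t) for t in tables]
print(sum(p < 0.05 for p in p_values), "of", len(p_values), "tables have p < 0.05")
```

Because the cells are filled independently here, small p values arise only by chance, which is the situation in which uncorrected thresholds such as p = 0.05 produce false positives.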

    Comparison of frequentist approach and Bayes factors (BF).

    Discovery of association of alignment positions of HBV core proteins with patient HLA types, here: A*01 (top row) and B*44 (bottom row). Sequence numbers in panel titles are feature-carrying fractions of the total of 148 sequences included in the alignment. Association of sequences with feature HLA were analyzed by Fisher’s exact test (panels A, D), BF with K = 1 (panels B, E), and BF with K_D (panels C, F). Alignment positions with association above certain thresholds (horizontal dashed lines) are marked by red stars and vertical dashed lines, namely p < 0.01 (A, D), or BF > 10 (B, C, E, F). The p values and BFs shown are the best for each alignment position (lowest p values, highest BFs).